Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries

Tan, Fuwen, Cascante-Bonilla, Paola, Guo, Xiaoxiao, Wu, Hui, Feng, Song, Ordonez, Vicente

Neural Information Processing Systems

This paper explores the task of interactive image retrieval using natural language queries, where a user progressively provides input queries to refine a set of retrieval results. Moreover, our work explores this problem in the context of complex image scenes containing multiple objects. We propose Drill-down, an effective framework for encoding multiple queries with an efficient, compact state representation that significantly extends current methods for single-round image retrieval. We show that using multiple rounds of natural language queries as input can be surprisingly effective for finding arbitrarily specific images of complex scenes. Furthermore, we find that existing image datasets with textual captions can provide a surprisingly effective form of weak supervision for this task. We compare our method with existing sequential encoding and embedding networks, demonstrating superior performance on two proposed benchmarks: automatic image retrieval in a simulated scenario that uses region captions as queries, and interactive image retrieval using real queries from human evaluators.
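To make the idea of a compact multi-round query state concrete, here is a minimal, hypothetical sketch (not the paper's actual model): each incoming query fills one of a fixed number of state slots; once the slots are full, a new query merges into the most similar slot; and an image is scored by matching its region vectors against the occupied slots. The toy text encoder, the slot-update rule, and the scoring function are all illustrative assumptions.

```python
import numpy as np

def encode_query(text, dim=8):
    # Hypothetical stand-in for a learned text encoder:
    # hash words into a fixed-size vector and L2-normalize.
    v = np.zeros(dim)
    for w in text.split():
        v[hash(w) % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-8)

class QueryState:
    """Compact state: M slots, each holding one aggregated query vector."""
    def __init__(self, num_slots=3, dim=8):
        self.slots = np.zeros((num_slots, dim))
        self.used = np.zeros(num_slots, dtype=bool)

    def update(self, query_vec):
        if not self.used.all():
            # Fill the first empty slot with the new query.
            i = int(np.argmax(~self.used))
            self.slots[i] = query_vec
            self.used[i] = True
        else:
            # All slots occupied: merge into the most similar slot.
            sims = self.slots @ query_vec
            i = int(np.argmax(sims))
            merged = self.slots[i] + query_vec
            self.slots[i] = merged / (np.linalg.norm(merged) + 1e-8)

    def score(self, region_vecs):
        # Image score: for each occupied slot, take its best-matching
        # region, then sum over slots.
        sims = self.slots[self.used] @ region_vecs.T
        return float(sims.max(axis=1).sum())

# Two rounds of user queries refine the state; regions of a candidate
# image are matched against it.
state = QueryState(num_slots=2, dim=8)
state.update(encode_query("a red couch in the corner"))
state.update(encode_query("a lamp on a wooden table"))
regions = np.stack([encode_query("red couch"),
                    encode_query("wooden table lamp")])
print(state.score(regions) > 0.0)  # → True
```

The key property this sketch shares with the described framework is that the state size stays fixed at M slots no matter how many query rounds arrive, so retrieval cost does not grow with the length of the interaction.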


Reviews: Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries

Neural Information Processing Systems

The main problem for me is that the paper promises a very realistic scenario (Figure 1) of how a user can refine a search with a sequence of refined queries. However, the majority of the model design and evaluation (except Section 4.2) is performed with dense region captions that have almost no sequential nature. While this is partially a strength, since no additional labels are required, the method seems especially suited to such disconnected queries -- the model reserves space for M disconnected queries, and updates are required only once those slots are filled. An analysis along these lines would provide a deeper understanding of when the proposed method works better. The user queries in Figure 1 seem very natural, but the simulated queries do not.


Reviews: Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries

Neural Information Processing Systems

This paper investigates the problem of multi-round natural language image retrieval, using annotations from the Visual Genome dataset for training and evaluation. After feedback and reviewer discussion, this paper received final ratings of 6, 6 and 7. Despite some concerns about the use of non-sequential annotation data for a sequential task, the reviewers found the proposed model to be generally sound and the experimental evaluation convincing, and the AC agrees. However, we would encourage the authors to pay close attention to the reviewer feedback when preparing the final paper version. In particular, the author feedback committed to including the additional baselines requested by R1, so these should be included in the final version as promised.


Balancing Reinforcement Learning Training Experiences in Interactive Information Retrieval

Chen, Limin, Tang, Zhiwen, Yang, Grace Hui

arXiv.org Artificial Intelligence

Interactive Information Retrieval (IIR) and Reinforcement Learning (RL) share many commonalities, including an agent that learns while it interacts, a long-term and complex goal, and an algorithm that explores and adapts. To successfully apply RL methods to IIR, one challenge is to obtain sufficient relevance labels to train the RL agents, which are notoriously sample-inefficient. However, in a text corpus annotated for a given query, it is not the relevant documents but the irrelevant documents that predominate. This yields highly unbalanced training experiences for the agent and prevents it from learning any effective policy. Our paper addresses this issue by using domain randomization to synthesize more relevant documents for training. Our experimental results on the Text REtrieval Conference (TREC) Dynamic Domain (DD) 2017 Track show that the proposed method boosts an RL agent's learning effectiveness by 22% in dealing with unseen situations.
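The balancing idea above can be sketched as follows. This is an illustrative toy, not the paper's method: pseudo-relevant documents are synthesized by randomly dropping and shuffling words of existing relevant ones until the relevant pool matches the irrelevant count; the function name and perturbation scheme are assumptions.

```python
import random

def synthesize_relevant(docs, target_count, drop_prob=0.2, seed=0):
    """Domain-randomization-style augmentation (illustrative only):
    grow the relevant pool to target_count by perturbing existing
    relevant documents with random word dropout and shuffling."""
    rng = random.Random(seed)
    pool = list(docs)
    while len(pool) < target_count:
        base = rng.choice(docs).split()
        # Drop each word with probability drop_prob; keep the
        # original if everything was dropped.
        kept = [w for w in base if rng.random() > drop_prob] or base
        rng.shuffle(kept)
        pool.append(" ".join(kept))
    return pool

relevant = ["solar power subsidies in rural areas",
            "wind farm permits and grid access"]
irrelevant_count = 10  # many more irrelevant docs in the corpus
balanced = synthesize_relevant(relevant, irrelevant_count)
print(len(balanced))  # → 10
```

With the pools balanced, each training episode is roughly as likely to surface a relevant document as an irrelevant one, which is the property the sample-inefficient RL agent needs.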

